Learning to Merge Word Senses

Authors

Rion Snow

Sushant Prakash

Daniel Jurafsky

Andrew Y. Ng

Abstract

It has been widely observed that different NLP applications require different sense granularities in order to best exploit word sense distinctions, and that for many applications WordNet senses are too fine-grained. In contrast to previously proposed automatic methods for sense clustering, we formulate sense merging as a supervised learning problem, exploiting human-labeled sense clusterings as training data. We train a discriminative classifier over a wide variety of features derived from WordNet structure, corpus-based evidence, and evidence from other lexical resources. Our learned similarity measure outperforms previously proposed automatic methods for sense clustering on the task of predicting human sense merging judgments, yielding an absolute F-score improvement of 4.1% on nouns, 13.6% on verbs, and 4.0% on adjectives. Finally, we propose a model for clustering sense taxonomies using the outputs of our classifier, and we make available several automatically sense-clustered WordNets of various sense granularities.
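The pipeline the abstract describes (score sense pairs with a learned classifier, then group senses from those scores) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two features, the toy training values, the sense labels, the 0.5 threshold, and the plain logistic regression are hypothetical, and the union-find grouping is a simple stand-in rather than the taxonomy clustering model proposed in the paper.

```python
import math

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit a tiny logistic regression with stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - yi  # gradient of the log loss with respect to z
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def merge_probability(model, x):
    """Predicted probability that a pair of senses should be merged."""
    w, b = model
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def cluster_senses(senses, pair_features, model, threshold=0.5):
    """Merge every pair scored at or above the threshold (transitively)."""
    parent = {s: s for s in senses}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path compression
            s = parent[s]
        return s

    for (a, b), x in pair_features.items():
        if merge_probability(model, x) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for s in senses:
        groups.setdefault(find(s), []).append(s)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical training pairs: [gloss overlap, taxonomy similarity] -> merged?
X_train = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
y_train = [1, 1, 0, 0]
model = train_logistic(X_train, y_train)

# Hypothetical sense pairs for one word, with made-up feature values.
senses = ["bank#1", "bank#2", "bank#3"]
pair_features = {
    ("bank#1", "bank#2"): [0.85, 0.9],
    ("bank#1", "bank#3"): [0.1, 0.1],
    ("bank#2", "bank#3"): [0.15, 0.2],
}
clusters = cluster_senses(senses, pair_features, model)
```

Raising or lowering `threshold` in this sketch gives coarser or finer groupings, which loosely mirrors the idea of producing sense inventories at several granularities.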


Similar resources

Taxonomy Learning Using Word Sense Induction

Taxonomies are an important resource for a variety of Natural Language Processing (NLP) applications. Despite this, the current state-of-the-art methods in taxonomy learning have disregarded word polysemy, in effect developing taxonomies that conflate word senses. In this paper, we present an unsupervised method that builds a taxonomy of senses learned automatically from an unlabelled corpus. O...


On Modeling Sense Relatedness in Multi-prototype Word Embedding

To enhance the expression ability of distributional word representation learning model, many researchers tend to induce word senses through clustering, and learn multiple embedding vectors for each word, namely multi-prototype word embedding model. However, most related work ignores the relatedness among word senses which actually plays an important role. In this paper, we propose a novel appro...


Unsupervised and Minimally Supervised Learning of Lexical Semantics Proceedings of the Workshop

Supervised word sense disambiguation requires training corpora that have been tagged with word senses, and these word senses typically come from a pre-existing sense inventory. Space limitations imposed by dictionary publishers have biased the field towards lists of discrete senses for an individual lexeme. This approach does not capture information about relatedness of individual senses. How i...


Joint Learning of Sense and Word Embeddings

Methods for learning lower-dimensional representations (embeddings) of words using unlabelled data have received renewed interest due to their myriad successes in various Natural Language Processing (NLP) tasks. However, despite their success, a common deficiency associated with most word embedding learning methods is that they learn a single representation for a word, ignoring the different ...


Corpus-based Ontology Learning for Word Sense Disambiguation

This paper proposes to disambiguate word senses by corpus-based ontology learning. Our approach is a hybrid method. First, we apply the previously-secured dictionary information to select the correct senses of some ambiguous words with high precision, and then use the ontology to disambiguate the remaining ambiguous words. The mutual information between concepts in the ontology was calculated b...



Publication date: 2007
